National Repository of Grey Literature 24 records found  1 - 10nextend  jump to record: Search took 0.01 seconds. 
Distributed Processing of IP flow Data
Krobot, Pavel ; Kořenek, Jan (referee) ; Žádník, Martin (advisor)
This thesis deals with the subject of distributed processing of IP flow. Main goal is to provide an implementation of a software collector which allows storing and processing huge amount of a network data in particular. There was studied an open-source implementation of a framework for the distributed processing of large data sets called Hadoop, which is based on MapReduce paradigm. There were made some experiments with this system which provided the comparison with the current systems and shown weaknesses of this framework. Based on this knowledge there was created a specification and scheme for an extension of current software collector within this work. In terms of the created scheme there was created an implementation of query framework for formed collector, which is considered as most critical in the field of distributed processing of IP flow data. Results of experiments with created implementation show significant performance growth and ability of linear scalability with some types of queries.
BigData Approach to Management of Large Netflow Datasets
Melkes, Miloslav ; Ráb, Jaroslav (referee) ; Ryšavý, Ondřej (advisor)
This master‘s thesis focuses on distributed processing of big data from network communication. It begins with exploring network communication based on TCP/IP model with focus on data units on each layer, which is necessary to process during analyzation. In terms of the actual processing of big data is described programming model MapReduce, architecture of Apache Hadoop technology and it‘s usage for processing network flows on computer cluster. Second part of this thesis deals with design and following implementation of the application for processing network flows from network communication. In this part are discussed main and problematic parts from the actual implementation. After that this thesis ends with a comparison with available applications for network analysis and evaluation set of tests which confirmed linear growth of acceleration.
Automatic Image Labelling
Lukáč, Michal ; Řezníček, Ivo (referee) ; Hradiš, Michal (advisor)
This thesis focuses on automatic image labelling to semantic categories. It describes the theory of classif cation and local features detection. It explains fundamental machine learning models used for image tagging, and how such models can be learned with Gradient descent. It propose solution with hierarchy for ImageNet and tagging images with attributes. MapReduce computing model is considered for learning on big data sets. In the last part it is described implementation, experimental and test results.
Web Page Visitor Monitoring
Jelič, Martin ; Očenášek, Pavel (referee) ; Burget, Radek (advisor)
This Bachelor's Thesis deals with web analytics, its terms, principles, related problems and their solutions. There are described at large a few tools for web analytics. The focus of this work is the design and implementation of a new web analytics tool that allows to monitor the web page traffic and evaluate the obtained data for internet project management purposes. There are shown the results of testing the mentioned tool in real traffic and the comparsion with the results of the existing tools, which differ from the new tool in some characteristics. The work discusses the advantages of using the MongoDB document-oriented database for website traffic monitoring purposes as well.
Implementation of Regular Expression Grouping in MapReduce Paradigm
Šafář, Martin ; Dvořák, Milan (referee) ; Kaštil, Jan (advisor)
The greatest contribution of this thesis is design and implementation of program, that uses MapReduce paradigm and Apache Hadoop for acceleration of regular expression grouping. This paper also describes algorithms, that are used for regular expression grouping and proposes some improvements for these algorithms. Experiments carried out in this thesis show, that a cluster of 20 computers can speed up the grouping ten times.
Big Data
Bútora, Matúš ; Bartík, Vladimír (referee) ; Hruška, Tomáš (advisor)
The aim of the bachelor thesis is to describe the Big Data issue and the OLAP aggregate operations. These operations are applied using Apache Hadoop technology. Most of the work is focused on the description of this technology. The last chapter contains application of aggregate operations and their implementation, following the conclusion of the work and the possibility for future development.
Application for Big Data
Blaho, Matúš ; Bartík, Vladimír (referee) ; Hruška, Tomáš (advisor)
This work deals with the description and analysis of the Big Data concept and its processing and use in the process of decision support. Suggested processing is based on the MapReduce concept designed for Big Data processing. The theoretical part of this work is largely about the Hadoop system that implements this concept. Its understanding is a key feature for properly designing applications that run within it. The work also contains design for specific Big Data processing applications. In the implementation part of the thesis is a description of Hadoop system management, description of implementation of MapReduce applications and description of their testing over data sets.
Scalable preprocessing of data using Hadoop tool
Marinič, Michal ; Šmirg, Ondřej (referee) ; Burget, Radim (advisor)
The thesis is concerned with scalable pre-processing of data using Hadoop tool which is used for processing of large volumes of data. In the first theoretical part it focuses on explaining of functioning and structure of the basic elements of Hadoop distributed file system and MapReduce methods for parallel processing. The latter practical part of the thesis describes the implementation of basic Hadoop cluster in pseudo-distributed mode for easy program-debugging, and also describes an implementation of Hadoop cluster in fully-distributed mode for simulation in practice.
Optimization of the Hadoop Platform for Distributed Computation
Čecho, Jaroslav ; Smrčka, Aleš (referee) ; Letko, Zdeněk (advisor)
This thesis is focusing on possibilities of improving the Apache Hadoop framework by outsourcing some computation to a graphic card using the NVIDIA CUDA technology. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model called mapreduce. NVIDIA CUDA is a platform which allows one to use a graphic card for a general computation. This thesis contains description and experimental implementations of suitable computation inside te Hadoop framework that can benefit from being executed on a graphic card.
Application for Big Data
Blaho, Matúš ; Bartík, Vladimír (referee) ; Hruška, Tomáš (advisor)
This work deals with the description and analysis of the Big Data concept and its processing and use in the process of decision support. Suggested processing is based on the MapReduce concept designed for Big Data processing. The theoretical part of this work is largely about the Hadoop system that implements this concept. Its understanding is a key feature for properly designing applications that run within it. The work also contains design for specific Big Data processing applications. In the implementation part of the thesis is a description of Hadoop system management, description of implementation of MapReduce applications and description of their testing over data sets.

National Repository of Grey Literature : 24 records found   1 - 10nextend  jump to record:
Interested in being notified about new results for this query?
Subscribe to the RSS feed.